516 research outputs found

    Jenga: Harnessing Heterogeneous Memories through Reconfigurable Cache Hierarchies

    Conventional memory systems are organized as a rigid hierarchy, with multiple levels of progressively larger and slower memories. Hierarchy allows a simple, fixed design to benefit a wide range of applications, because working sets settle at the smallest (and fastest) level they fit in. However, rigid hierarchies also cause significant overheads, because each level adds latency and energy even when it does not capture the working set. In emerging systems with heterogeneous memory technologies such as stacked DRAM, these overheads often limit performance and efficiency. We propose Jenga, a reconfigurable cache hierarchy that avoids these pathologies and approaches the performance of a hierarchy optimized for each application. Jenga monitors application behavior and dynamically builds virtual cache hierarchies out of heterogeneous, distributed cache banks. Jenga uses simple hardware support and a novel software runtime to configure virtual cache hierarchies. On a 36-core CMP with a 1 GB stacked-DRAM cache, Jenga outperforms a combination of state-of-the-art techniques by 10% on average and by up to 36%, and does so while saving energy, improving system-wide energy-delay product by 29% on average and by up to 96%.

    Scaling Distributed Cache Hierarchies through Computation and Data Co-Scheduling

    Cache hierarchies are increasingly non-uniform, so for systems to scale efficiently, data must be close to the threads that use it. Moreover, cache capacity is limited and contended among threads, introducing complex capacity/latency tradeoffs. Prior NUCA schemes have focused on managing data to reduce access latency, but have ignored thread placement; and applying prior NUMA thread placement schemes to NUCA is inefficient, as capacity, not bandwidth, is the main constraint. We present CDCS, a technique to jointly place threads and data in multicores with distributed shared caches. We develop novel monitoring hardware that enables fine-grained space allocation on large caches, and data movement support to allow frequent full-chip reconfigurations. On a 64-core system, CDCS outperforms an S-NUCA LLC by 46% on average (up to 76%) in weighted speedup and saves 36% of system energy. CDCS also outperforms state-of-the-art NUCA schemes under different thread scheduling policies.
    Funding: National Science Foundation (U.S.) (Grant CCF-1318384); Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Jacobs Presidential Fellowship); United States. Defense Advanced Research Projects Agency (PERFECT Contract HR0011-13-2-0005)

    Demystifying Map Space Exploration for NPUs

    Map Space Exploration is the problem of finding optimized mappings of a Deep Neural Network (DNN) model on an accelerator. It is known to be extremely computationally expensive, and there has been active research looking at both heuristics and learning-based methods to make the problem computationally tractable. However, while there are dozens of mappers out there (all empirically claiming to find better mappings than others), the research community lacks systematic insights on how different search techniques navigate the map-space and how different mapping axes contribute to the accelerator's performance and efficiency. Such insights are crucial to developing mapping frameworks for emerging DNNs that are increasingly irregular (due to neural architecture search) and sparse, making the corresponding map spaces much more complex. In this work, rather than proposing yet another mapper, we do a first-of-its-kind apples-to-apples comparison of search techniques leveraged by different mappers. Next, we extract the learnings from our study and propose two new techniques that can augment existing mappers -- warm-start and sparsity-aware -- that demonstrate speedups, scalability, and robustness across diverse DNN models.

    Bridge scour evaluation based on ambient vibration

    The vulnerability of bridges to hazards such as earthquakes, wind and floods necessitates special structural characteristics. To guarantee the stability of bridge structures, the precise evaluation of the scour depth of bridge foundations has recently become an important issue, as most of the unexpected damage to or collapse of bridges has been attributed to hydraulic issues. In this paper, a vibration-based bridge health monitoring system that uses only the response of the superstructure to rapidly evaluate the embedded depth of a bridge column is proposed. To clarify the complex fluid-solid coupling phenomenon, the effects of embedded depth and water level were first verified through a series of static experiments. A confined finite element model simulated by soil spring effects was then established to illustrate the relationship between the fundamental frequency and the embedded depth. Using the proposed algorithm, the health of the bridge can be inferred by processing the ambient vibration response of the superstructure. To implement the proposed algorithm, an SHM prototype system monitoring environmental factors such as temperature, water level, and inclination was developed to support on-line processing. The performance of the proposed system was verified by a series of dynamic bridge scour experiments conducted in a laboratory flume and compared with readings from a water-proof camera. The results showed that, using the proposed vibration-based bridge health monitoring system, the embedded depth of a bridge column during complex scour processes can be reliably calculated.

    Quantitative measures of functional outcomes and quality of life in patients with C5 palsy

    Background: It is generally understood that postoperative C5 palsy can occur with anterior or posterior decompression surgery, but functional measures of the palsy have not been well documented. This study aimed to investigate the incidence of C5 palsy in different surgical procedures, examine the correlations between muscle strength, upper extremity functional measures, and health-related quality of life, and observe potential risk factors contributing to C5 palsy. Methods: Our investigation involved a retrospective study design. A total of 364 patients who underwent decompression surgery were indicated within the selected exclusion criteria. Additionally, 12 C5 palsy patients were recruited. The relationships between the manual muscle test (MMT), the action research arm test (ARAT), the Jebsen test of hand function (JTHF), and the European quality of life-5 dimensions (EQ-5D) were studied, and univariate analyses were performed to search for possible risk factors and to investigate recovery. Results: Among the 12 cases analyzed, the overall C5 palsy incidence was 3.3%: 0.7% in anterior procedures (n = 2), 8.8% in posterior procedures (n = 6), and 36.4% in combined procedures (n = 4). Moderate-to-high correlations were observed between the ARAT, JTHF, EQ-5D visual analog scale scores, and MMT (r = 0.636–0.899). There were significant differences in patient age, etiology of cervical lesion, variable decompression procedures, and the number of decompression levels between the C5 palsy and non-C5 palsy groups. For female patients (p = 0.018) and number of decompression levels (p = 0.028), there were significant differences between the complete recovery and the incomplete recovery groups. Conclusion: Patients undergoing combined anterior–posterior decompression surgery had the highest incidence of C5 palsy, and correlations between the ARAT, JTHF, EQ-5D visual analog scale clinical tools, and MMT scores supported these findings. Female status and lower decompression levels could also be predictive factors for complete recovery, although additional research is needed to substantiate these findings.
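As a quick arithmetic check on the figures in this abstract, the overall incidence of 3.3% is consistent with 12 C5 palsy cases out of 364 decompression patients. The short sketch below reproduces that calculation; the function name `incidence_percent` is hypothetical and chosen only for illustration, and the per-procedure denominators are not computed because the abstract does not state them.

```python
def incidence_percent(cases: int, total: int) -> float:
    """Return the incidence as a percentage of the total cohort."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * cases / total


# Counts taken from the abstract: 12 C5 palsy cases among 364 patients.
overall = incidence_percent(12, 364)
print(f"Overall C5 palsy incidence: {overall:.1f}%")
```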

    Organic Electrochemical Transistors/SERS-Active Hybrid Biosensors Featuring Gold Nanoparticles Immobilized on Thiol-Functionalized PEDOT Films

    In this study we immobilized gold nanoparticles (AuNPs) onto thiol-functionalized poly(3,4-ethylenedioxythiophene) (PEDOT) films as bioelectronic interfaces (BEIs) to be integrated into organic electrochemical transistors (OECTs) for effective detection of dopamine (DA) and also as surface-enhanced Raman scattering (SERS)-active substrates for the selective detection of p-cresol (PC) in the presence of multiple interferers. This novel PEDOT-based BEI device platform combined (i) an underlying layer of polystyrenesulfonate-doped PEDOT (PEDOT:PSS), which greatly enhanced the transconductance and sensitivity of OECTs for electrochemical sensing of DA in the presence of the other metabolites ascorbic acid and uric acid, as well as the amperometric response toward DA, with a detection limit (S/N = 3) of 37 nM in the linear range from 50 nM to 100 μM; with (ii) a top interfacial layer of AuNP-immobilized three-dimensional (3D) thiol-functionalized PEDOT, which not only improved the performance of OECTs for detecting DA, due to the signal amplification effect of the AuNPs with high catalytic activity, but also enabled downstream analysis (SERS detection) of PC on the same chip. We demonstrate that PEDOT-based 3D OECT devices decorated with a high density of AuNPs can display new versatility for the design of next-generation biosensors for point-of-care diagnostics.

    Integration, Launch, and First Results from IDEASSat/INSPIRESat-2 - A 3U CubeSat for Ionospheric Physics and Multi-National Capacity Building

    The Ionospheric Dynamics and Attitude Subsystem Satellite (IDEASSat) is a 3U CubeSat carrying a Compact Ionospheric Probe (CIP) to detect ionospheric irregularities that can impact the usability and accuracy of global satellite navigation systems (GNSS), as well as satellite and terrestrial over-the-horizon communications. The spacecraft was developed by National Central University (NCU) in Taiwan, with additional development and operational support from partners in the International Satellite Program in Science and Education (INSPIRE) consortium. The spacecraft system needed to accommodate these mission objectives required three-axis attitude control, dual-band communications capable of supporting both tracking, telemetry and command (TT&C) and science data downlink, as well as flight software and ground systems capable of supporting the autonomous operation and short contact times inherent to a low Earth orbit mission developed on a limited university budget with funding agency-imposed constraints. As the first spacecraft developed at NCU, lessons learned during the development, integration, and operation of IDEASSat have proven to be crucial to the objective of developing a sustainable small satellite program. IDEASSat was launched successfully on January 24, 2021 aboard the SpaceX Falcon 9 Transporter 1 flight and successfully began operations, demonstrating power, thermal, and structural margins, as well as validation of uplink and downlink communications functionality, and autonomous operation. A serious anomaly occurred after 22 days on orbit, when communication with the spacecraft was abruptly lost. Communication was re-established after 1.5 months for sufficient time to downlink stored flight data, which allowed the cause of the blackout to be identified to a high level of confidence and precision. In this paper, we will report on experiences and anomalies encountered during the final flight model integration and delivery, commissioning, and operations. The agile support from the international amateur radio community and INSPIRE partners was extremely helpful in this process, especially during the initial commissioning phase following launch. It is hoped that the lessons learned reported here will be helpful for other university teams working to develop spaceflight capacity.

    Search for new phenomena in final states with an energetic jet and large missing transverse momentum in pp collisions at √s = 8 TeV with the ATLAS detector

    Results of a search for new phenomena in final states with an energetic jet and large missing transverse momentum are reported. The search uses 20.3 fb^-1 of √s = 8 TeV data collected in 2012 with the ATLAS detector at the LHC. Events are required to have at least one jet with p_T > 120 GeV and no leptons. Nine signal regions are considered with increasing missing transverse momentum requirements between E_T^miss > 150 GeV and E_T^miss > 700 GeV. Good agreement is observed between the number of events in data and Standard Model expectations. The results are translated into exclusion limits on models with either large extra spatial dimensions, pair production of weakly interacting dark matter candidates, or production of very light gravitinos in a gauge-mediated supersymmetric model. In addition, limits on the production of an invisibly decaying Higgs-like boson leading to similar topologies in the final state are presented.

    Rationalization and Design of the Complementarity Determining Region Sequences in an Antibody-Antigen Recognition Interface

    Protein-protein interactions are critical determinants in biological systems. Engineered proteins binding to specific areas on protein surfaces could lead to therapeutics or diagnostics for treating diseases in humans. But designing epitope-specific protein-protein interactions with computational atomistic interaction free energy remains a difficult challenge. Here we show that, with the antibody-VEGF (vascular endothelial growth factor) interaction as a model system, the experimentally observed amino acid preferences in the antibody-antigen interface can be rationalized with 3-dimensional distributions of interacting atoms derived from the database of protein structures. Machine learning models established on the rationalization can be generalized to design amino acid preferences in antibody-antigen interfaces, for which the experimental validations are tractable with current high-throughput synthetic antibody display technologies. Leave-one-out cross validation on the benchmark system yielded accuracy, precision, recall (sensitivity), and specificity of 0.69, 0.45, 0.63, and 0.71, respectively, for the overall binary predictions, and the overall Matthews correlation coefficient of the 20 amino acid types in the 24 interface CDR positions was 0.312. The structure-based computational antibody design methodology was further tested with other antibodies binding to VEGF. The results indicate that the methodology could provide alternatives to the current antibody technologies based on animal immune systems in engineering therapeutic and diagnostic antibodies against predetermined antigen epitopes.
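For readers unfamiliar with the evaluation measures named in this abstract, the following minimal sketch shows how accuracy, precision, recall (sensitivity), specificity, and the Matthews correlation coefficient are conventionally computed from a binary confusion matrix. It is not the authors' evaluation code: the function name `binary_metrics` and the example counts are hypothetical, and the abstract does not report the underlying confusion matrix.

```python
from math import sqrt


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    # Product of the four marginal sums; the MCC denominator is its square root.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),          # also called sensitivity
        "specificity": tn / (tn + fp),
        # MCC is conventionally set to 0 when any marginal sum is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts for illustration only (not taken from the paper).
metrics = binary_metrics(tp=8, fp=2, tn=6, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it remains informative when positive and negative classes are imbalanced.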